FPGA architecture for pairwise statistical significance estimation
Authors
Abstract
Sequence comparison is one of the most fundamental computational problems in bioinformatics. Pairwise sequence alignment methods align two sequences using a substitution matrix (such as BLOSUM62) that holds the pairwise scores for aligning different residues with each other, and give an alignment score for the given sequence pair. This work addresses the problem of accurately estimating the statistical significance of a pairwise alignment for the purpose of identifying related sequences, by making the sequence comparison process more sequence-specific. Specifically, we develop algorithms for sequence-specific strategies for hardware acceleration of pairwise sequence alignment in conjunction with statistical significance estimation. Using pairwise statistical significance has been shown to give better retrieval accuracy than the database statistical significance reported by popular database search programs such as BLAST and PSI-BLAST. We provide a ‘flexible array’ hardware architecture, which provides a scalable systolic array suitable for both long and short sequences. Results on the XtremeData XD1000 FPGA platform show a speed-up of more than a factor of 200.
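As a rough illustration of the computation being accelerated, the following Python sketch scores a local alignment and then estimates a sequence-pair-specific significance by re-aligning against shuffled copies of one sequence. It is a minimal software sketch under stated assumptions, not the hardware design described above: the function names, the simple match/mismatch scoring (standing in for a substitution matrix such as BLOSUM62), the linear gap penalty, and the counting-based p-value (used here in place of fitting a score distribution) are all hypothetical choices made for brevity.

```python
import random


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty.

    The match/mismatch/gap values are illustrative assumptions; a real
    protein comparison would look scores up in a substitution matrix.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for ca in a:
        curr = [0]  # column 0 of the current row is always 0
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def empirical_p_value(a, b, shuffles=200, seed=0):
    """Estimate significance for this specific pair by shuffling b.

    Simplification (assumption): the p-value is the fraction of shuffled
    alignments scoring at least as high as the observed one, with a +1
    correction so the estimate is never exactly zero.
    """
    rng = random.Random(seed)
    observed = local_alignment_score(a, b)
    letters = list(b)
    hits = 0
    for _ in range(shuffles):
        rng.shuffle(letters)
        if local_alignment_score(a, "".join(letters)) >= observed:
            hits += 1
    return observed, (hits + 1) / (shuffles + 1)
```

Each table cell depends only on its left, upper, and upper-left neighbours, so all cells on one anti-diagonal can be computed at the same time; this is the kind of parallelism a systolic array typically exploits. The many independent shuffled alignments are a second source of parallelism.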
Similar resources
Efficient Pairwise Statistical Significance Estimation using FPGAs
In this paper, we present a fast pairwise statistical significance estimator using a Field Programmable Gate Array (FPGA) coprocessor. The running time of the pairwise statistical significance estimation routine is dominated by the hundreds of local alignments it must compute. By offloading the alignment task to an accelerator designed to concurrently process multiple independent alignments, we...
High Performance Sequence Mining Using Pairwise Statistical Significance
With the deluge of sequence data resulting from next-generation sequencing, there comes a need to leverage large-scale biological sequence data. Therefore, the role of high-performance computational methods for mining interesting information solely from these sequence data becomes increasingly important. Almost everything in bioinformatics counts on the inter-relationship between sequ...
Enhancing Parallelism of Pairwise Statistical Significance Estimation for Local Sequence Alignment
Pairwise statistical significance (PSS) has been found to be able to accurately identify related sequences (homology detection), which is a fundamental step in numerous applications relating to sequence analysis. Although more accurate than database statistical significance, it is both computationally intensive and data intensive to construct the empirical score distribution during the estimati...
An Analytical FPGA Performance Model for Reconfigurable Computing
Optimizing FPGA architectures is one of the key challenges in the digital design flow. Traditionally, FPGA designers make use of CAD tools for evaluating architectures in terms of area, delay and power. Recently, analytical methods have been proposed to optimize architectures faster and more easily. A complete analytical power, area and delay model has received little attention to date. In addi...
Estimating Pairwise Statistical Significance of Protein Local Alignments Using a Clustering-Classification Approach Based on Amino Acid Composition
A central question in pairwise sequence comparison is assessing the statistical significance of the alignment. The alignment score distribution is known to follow an extreme value distribution with analytically calculable parameters K and λ for ungapped alignments with one substitution matrix. But no statistical theory is currently available for the gapped case and for alignments using multiple...
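For the ungapped case mentioned above, the extreme value form is commonly written as P(S ≥ x) ≈ 1 − exp(−K·m·n·e^(−λx)), where m and n are the two sequence lengths. The short Python sketch below evaluates that expression; the function name and the K and λ values used in the usage note are illustrative placeholders, not figures taken from any of the papers listed here.

```python
import math


def evd_p_value(score, m, n, K, lam):
    """P-value of an ungapped local alignment score under the extreme
    value form P(S >= x) ~= 1 - exp(-K * m * n * exp(-lam * x)).

    K and lam depend on the substitution matrix and residue frequencies;
    the caller must supply them (any defaults would be assumptions).
    """
    expected = K * m * n * math.exp(-lam * score)  # expected number of chance hits
    return -math.expm1(-expected)  # 1 - exp(-expected), accurate for tiny values
```

For example, `evd_p_value(60, 100, 100, K=0.134, lam=0.318)` is much smaller than `evd_p_value(30, 100, 100, K=0.134, lam=0.318)`, reflecting how quickly chance explanations become unlikely as the score grows.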
Journal title:
- IJHPSA
Volume 4, Issue
Pages -
Publication date 2013